Statistically validated hierarchical clustering: Nested partitions in hierarchical trees

Authors

Abstract

We develop an algorithm that is fast and scalable in the detection of a nested partition extracted from a dendrogram obtained from hierarchical clustering of a multivariate series. Our algorithm provides a p-value for each clade observed in the hierarchical tree. The p-value is obtained by computing many bootstrap replicas of the dissimilarity matrix and by performing a statistical test on the difference between the dissimilarity associated with a given clade and that associated with its parent node. We prove the efficacy of our algorithm with a set of benchmarks generated by a hierarchically nested factor model. We compare the results of our algorithm with those of Pvclust. Pvclust is a widely-used algorithm pursuing a global approach, originally developed in the context of phylogenetic studies. In our numerical experiments, we focus on the role of multiple hypothesis test correction and on the robustness of the algorithms to inaccuracies and errors in the datasets. We verify that our algorithm is much faster than Pvclust and has better scalability both in the number of elements and in the number of records of the investigated multivariate set. We also apply our algorithm to two empirical datasets, one related to a biological complex system and the other to financial time-series. The clusters detected with our methodology are meaningful with respect to some consensus partitioning...
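The procedure summarized in the abstract (bootstrap replicas of the dissimilarity matrix, a per-clade test against the parent node, and a multiple hypothesis correction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the choice of 1 minus Pearson correlation as dissimilarity, average linkage, the one-sided bootstrap p-value, the Bonferroni correction, and all function names (`clade_pvalues`, `validated_clades`, `two_block_benchmark`) are assumptions made for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform


def correlation_dissimilarity(X):
    """Dissimilarity 1 - Pearson correlation between rows of X (assumed choice)."""
    D = 1.0 - np.corrcoef(X)
    np.fill_diagonal(D, 0.0)
    return (D + D.T) / 2.0


def clade_pvalues(X, n_boot=200, seed=0):
    """Bootstrap p-value for every non-root clade of an average-linkage tree.

    X has one row per element and one column per record. The p-value of a
    clade is the (smoothed) fraction of bootstrap replicas in which its
    dissimilarity is not smaller than the dissimilarity of its parent.
    """
    n, T = X.shape
    rng = np.random.default_rng(seed)
    D0 = correlation_dissimilarity(X)
    Z = linkage(squareform(D0, checks=False), method="average")

    # Leaf members, children and parent of each node of the dendrogram.
    members = [[i] for i in range(n)]
    children, parent = {}, {}
    for i, (a, b) in enumerate(Z[:, :2].astype(int)):
        a, b, node = int(a), int(b), n + i
        members.append(members[a] + members[b])
        children[node] = (a, b)
        parent[a] = parent[b] = node

    internal = sorted(children)  # the last entry is the root
    col = {node: j for j, node in enumerate(internal)}
    heights = np.empty((n_boot, len(internal)))
    for r in range(n_boot):
        # Resample records (columns) with replacement.
        D = correlation_dissimilarity(X[:, rng.integers(0, T, size=T)])
        for node in internal:
            a, b = children[node]
            # Average-linkage dissimilarity of the clade in this replica.
            heights[r, col[node]] = D[np.ix_(members[a], members[b])].mean()

    pvals = {}
    for node in internal[:-1]:  # the root has no parent to compare with
        diff = heights[:, col[parent[node]]] - heights[:, col[node]]
        pvals[node] = (1 + int(np.sum(diff <= 0))) / (1 + n_boot)
    return Z, members, pvals


def validated_clades(pvals, alpha=0.05):
    """Bonferroni correction over all tested clades (one possible correction)."""
    m = len(pvals)
    return [node for node, p in pvals.items() if p <= alpha / m]


def two_block_benchmark(seed=1, T=500):
    """Toy benchmark: two groups of four series, each driven by its own factor."""
    rng = np.random.default_rng(seed)
    f1, f2 = rng.standard_normal((2, T))
    return np.vstack([f1 + 0.5 * rng.standard_normal((4, T)),
                      f2 + 0.5 * rng.standard_normal((4, T))])
```

Calling `clade_pvalues(two_block_benchmark())` and passing the resulting p-values to `validated_clades` returns the node indices of the clades that survive the correction; `members[node]` lists the elements of each such clade. A less conservative correction, such as a false discovery rate procedure, could be substituted in `validated_clades` without changing the rest of the sketch.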

Similar resources

Hierarchical clustering in minimum spanning trees.

The identification of clusters or communities in complex networks is a reappearing problem. The minimum spanning tree (MST), the tree connecting all nodes with minimum total weight, is regarded as an important transport backbone of the original weighted graph. We hypothesize that the clustering of the MST reveals insight in the hierarchical structure of weighted graphs. However, existing theori...

Dependent nonparametric trees for dynamic hierarchical clustering

Hierarchical clustering methods offer an intuitive and powerful way to model a wide variety of data sets. However, the assumption of a fixed hierarchy is often overly restrictive when working with data generated over a period of time: We expect both the structure of our hierarchy, and the parameters of the clusters, to evolve with time. In this paper, we present a distribution over collections ...

HIERARCHICAL DATA CLUSTERING MODEL FOR ANALYZING PASSENGERS’ TRIP IN HIGHWAYS

One of the most important issues in urban planning is developing sustainable public transportation. The basic prerequisite for this is analyzing current conditions, especially on the basis of data. Data mining is a set of new techniques that go beyond statistical data analysis. Clustering is a subset of these techniques, one of which is used for analyzing passengers’ trips. The result of...

Hierarchical Clustering of Trees: Algorithms and Experiments

We focus on the problem of experimentally evaluating the quality of hierarchical decompositions of trees with respect to criteria relevant in graph drawing applications. We suggest a new family of tree clustering algorithms based on the notion of t-divider and we empirically show the relevance of this concept as a generalization of the ideas of centroid and separator. We compare the t-divider b...

Improved initialisation of model-based clustering using Gaussian hierarchical partitions

Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and so to different clustering partitions. Among the several approaches available in the literature, model-based agglomerative hierarchical clustering is used to provide initial partitions in the popular mc...

Journal

Journal title: Physica D: Nonlinear Phenomena

Year: 2022

ISSN: ['1872-8022', '0167-2789']

DOI: https://doi.org/10.1016/j.physa.2022.126933